Abstract


Information capacity of Gaussian channels is one of the basic
problems of information theory. Shannon's results for white Gaussian
channels and Fano's "waterfilling" analysis of stationary Gaussian
channels are two of the best-known works of early information theory.
Results are given here which extend to a general framework these
results and others due to Gallager and to Kadota, Zakai, and Ziv.
The development applies to arbitrary Gaussian channels when the
channel noise has sample paths in a separable Banach space, and to a
large class of Gaussian channels when the noise has sample paths in a
linear topological vector space. Solutions for the capacity are
given for both matched and mismatched channels.


Introduction 


The modern theory of information is largely based on the pioneering
work of C.E. Shannon [1]. The contributions and importance of
information  theory  to  the  advancement  of  technology  are  very  well 
known,  and  need  not  be  summarized  here.  However,  new  applications  of 
a  different  nature  seem  likely  to  arise  in  the  not-far-distant 
future.  Some  of  these  potential  applications  would  require  a  much 
deeper  development  of  the  theory  than  has  been  needed  heretofore. 
This  is  in  part  because  of  rapid  advances  in  technology  in  areas  such 
as  computers  and  communications.  Thus,  one  may  envision  computers  of 
such  high  capability  that  their  optimum  use  will  require  mathematical 
models  using  infinite-dimensional  methods.  Fiber  optics  is  already 
leading  to  communication  channels  of  extremely  high  bandwidth.  Also 
to  be  considered  is  the  need  to  develop  information-theoretic  models 
and  methods  for  applications  which  do  not  fit  into  the  classical  mold 
of  a  communications  channel  with  stationary  Gaussian  noise  or  a 
discrete  memoryless  channel.  On  the  one  hand,  some  communication 
channels  contain  nonstationary  noise  as  a  major  source  of 
interference.  In  another  direction,  information  theory  is  viewed  as  a 
means  of  evaluating  and  designing  systems  in  areas  such  as  image 
processing,  artificial  intelligence,  and  surveillance. 

Thus,  the  scope  of  information  theory  as  presently  applied  may 
require  considerable  expansion  in  order  to  meet  the  needs  of  the 
future.  In  particular,  mathematical  models  may  be  needed  for  problems 
of  a  very  general  nature,  including  channels  with  memory,  which  may 
be infinite-dimensional, nonstationary, and possibly non-Gaussian.

The present article gives a treatment of capacity for Gaussian
channels  in  a  very  general  setting:  when  the  stochastic  processes  of 
interest  induce  measures  on  a  linear  topological  vector  space.  The 
work  is  an  extension  of  previous  results  for  induced  measures  on  a 
separable  Hilbert  space  [2],  [3].  Although  the  latter  model  will  be 
sufficiently  general  for  most  applications,  it  is  not  likely  to  be 
adequate  for  a  treatment  of  nonstandard  applications  such  as  random 
fields,  artificial  intelligence,  and  surveillance. 

In the case of stochastic processes with sample functions belonging
to a separable Hilbert space, the results given in [2] and
[3] represent a substantial generalization of previous work. This
previous work includes Shannon's original white noise channel [1],
Gallager's further work on this model [4], Kadota, Zakai, and Ziv's
work on the Wiener channel [5], and the results of Fano [6] and
Gallager [4] for stationary Gaussian channels. All of this prior work
makes various assumptions on the channel noise.

Of  course,  in  practical  applications  the  coding  capacity  is  most 
important.  Partial  results  in  this  area  for  these  more  general  models 
have  been  obtained  [7],  [8].  It  can  be  expected  that  more  complete 
solutions  of  the  coding  capacity  problem  will  require  the 
availability  of  general  results  on  information  capacity  such  as  those 
summarized  here,  since  proofs  of  coding  capacity  typically  involve 
use  of  the  information  capacity. 

This paper discusses the general framework in which these problems
have been solved, and summarizes the solutions. Proofs will not
be included: it will be seen that one can modify the proofs of the
Hilbert space solutions given in [2] and [3]. This has already been
done in [7] for the case of the "matched" channel analyzed in [2],
and similar methods can be used for the "mismatched" channel
considered in [3]. Thus, the development here will be limited to defining
the framework of the problem, providing the supplementary details
needed to adapt the Hilbert-space solutions and proofs of [2] and [3]
to the present more general setup, and then stating the results.


Mutual  Information  and  Channel  Capacity 

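Let $(X,\beta)$ and $(Y,\mathcal{F})$ be two measurable spaces, with $\mu_{XY}$ a
probability on $(X \times Y, \beta \times \mathcal{F})$. For the sake of clarity, $\mu_{XY}$ is
called a joint measure. Denote by $\mu_X$ and $\mu_Y$ the projections of
$\mu_{XY}$ on $(X,\beta)$ and $(Y,\mathcal{F})$, and by $\mu_X \otimes \mu_Y$ the product measure on
$(X \times Y, \beta \times \mathcal{F})$. The (average) mutual information $I(\mu_{XY})$ of
$\mu_{XY}$ is defined to be

$$I(\mu_{XY}) = \sup \sum_{i=1}^{N} \mu_{XY}(C_i) \log \frac{\mu_{XY}(C_i)}{\mu_X \otimes \mu_Y(C_i)} \qquad (1)$$

where the supremum is over all $N \ge 1$ and all measurable partitions
$C_1,\ldots,C_N$ of $(X \times Y, \beta \times \mathcal{F})$. It follows immediately that
$I(\mu_{XY}) = \infty$ when it is false that $\mu_{XY}$ is absolutely continuous
with respect to $\mu_X \otimes \mu_Y$. When $\mu_{XY} \ll \mu_X \otimes \mu_Y$, then [9]

$$I(\mu_{XY}) = \int_{X \times Y} \left[ \log \frac{d\mu_{XY}}{d(\mu_X \otimes \mu_Y)}(x,y) \right] d\mu_{XY}(x,y). \qquad (2)$$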
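
As an informal illustration of definition (1), the following sketch evaluates the partition sum for a small discrete joint distribution. The 2x2 probability table and the function name `partition_information` are hypothetical choices made only for this example; they are not taken from the paper. For a finite space the finest partition attains the supremum, and coarser partitions give smaller values.

```python
import math
from itertools import product


def partition_information(joint, cells):
    """Evaluate the sum in (1) for one partition of a finite product space.

    joint: 2-D list with joint[i][j] = probability of the pair (i, j).
    cells: list of cells, each a list of (i, j) index pairs; together the
           cells are assumed to partition the whole index set.
    """
    px = [sum(row) for row in joint]        # marginal of the first coordinate
    py = [sum(col) for col in zip(*joint)]  # marginal of the second coordinate
    total = 0.0
    for cell in cells:
        p_joint = sum(joint[i][j] for i, j in cell)
        p_prod = sum(px[i] * py[j] for i, j in cell)
        if p_joint > 0.0:                   # convention: 0 * log 0 = 0
            total += p_joint * math.log(p_joint / p_prod)
    return total


# Hypothetical joint distribution, chosen only for illustration.
joint = [[0.30, 0.10],
         [0.05, 0.55]]
points = list(product(range(2), range(2)))

finest = [[pt] for pt in points]                 # one point per cell
coarse = [[(0, 0), (1, 1)], [(0, 1), (1, 0)]]    # two cells
trivial = [points]                               # a single cell

i_finest = partition_information(joint, finest)
i_coarse = partition_information(joint, coarse)
i_trivial = partition_information(joint, trivial)
print(i_trivial, i_coarse, i_finest)
```

The ordering of the three printed values reflects the fact that refining a partition cannot decrease the sum in (1).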


Channel information capacity is defined as the supremum of
$I(\mu_{XY})$ over all $\mu_{XY}$ in a suitable set. In the framework of most
communication channels, and to be used here, the channel model is
defined as follows. A measure $\mu_{XN}$ on $(X \times Y, \beta \times \mathcal{F})$ describes the
statistical relationship between the message $X$ and the channel noise $N$;
usually, as we shall assume, $\mu_{XN} = \mu_X \otimes \mu_N$. The channel output $Y$ is
described by a measure $\mu_Y = \mu_{XN} \circ g^{-1}$, where $g$ is
$(X \times Y, \beta \times \mathcal{F})/(Y,\mathcal{F})$-measurable. The joint measure is then
$\mu_{XY} = \mu_{XN} \circ f^{-1}$, where $f(x,y) = (x, g(x,y))$. The most typical
situation in engineering applications is for $g(x,n) = A(x)+n$, where $A$
is an $(X,\beta)/(Y,\mathcal{F})$-measurable coding function. In general, the
capacity is then defined as $\sup_Q I(\mu_{XY})$, where $Q$ is a set of
constraints on all admissible pairs $(\mu_X, A)$ of message measures and
coding functions $A$. However, if $A$ is 1:1 and bimeasurable, then no
information is lost due to $A$.

That is, let $\mu_{SY} = [\mu_S \otimes \mu_N] \circ h^{-1}$, where $\mu_S = \mu_X \circ A^{-1}$ and
$h(x,y) = (x, x+y)$. If $A$ is 1:1 and $(X,\beta)/(Y,\mathcal{F})$ bimeasurable, then
(1) shows that $I(\mu_{SY}) = I(\mu_{XY})$. If $X$ and $Y$ are Polish (complete,
separable, metrizable), then by Kuratowski's Borel mapping theorem [10]
any 1:1 Borel-measurable map $A$: $X \to Y$ is Borel-bimeasurable.

We shall assume here that $X = Y$, $\beta = \mathcal{F}$, $A = I$ (identity), so
$g(x,y) = x + y$. The extension to the more general case can be
obtained by either restricting attention to coding functions $A$ which
are 1:1 and bimeasurable, or else by computing the information lost
due to a coding function which does not have these properties.

Mathematical  Structure 

The following assumptions will be made henceforth. $E$ is a
locally convex Hausdorff linear topological vector space over the
real numbers, with topological dual $E'$. It will also be assumed that
$E$ is quasi-complete: every closed and bounded subset is complete. $E$
must then be sequentially complete. $\sigma(E')$ will denote the cylindrical
$\sigma$-field, generated by the elements of $E'$, and $\overline{\sigma(E')}^{\mu}$ its
completion under the measure $\mu$. For $x$ in $E$ and $y$ in $E'$, the value
of $y$ at the point $x$ will be denoted by $\langle y,x \rangle$.
The noise measure $\mu_N$ will be defined on $(E, \sigma(E'))$. $\mu_N$ will be
assumed to be Gaussian and zero-mean: $\mu_N \circ \xi^{-1}$ is a zero-mean Gaussian
distribution on $\mathbb{R}$ for each $\xi$ in $E'$. $\mu_N$ will be assumed to have a
covariance operator $R_N$: $E' \to E$. $R_N$ is linear, self-adjoint and
nonnegative: $\langle x, R_N y \rangle = \langle y, R_N x \rangle$ and $\langle x, R_N x \rangle \ge 0$ for all
$x, y$ in $E'$. $\mu_N$ has characteristic function given by

$$\hat{\mu}_N(x) = \int_E e^{i \langle x,y \rangle} \, d\mu_N(y) = e^{-\frac{1}{2} \langle x, R_N x \rangle},$$

and $\langle x, R_N y \rangle = \int_E \langle x,u \rangle \langle y,u \rangle \, d\mu_N(u)$.

Under these assumptions, it is known [11] that there exists a
unique Hilbert space $H_N$ contained in $E$, such that the natural
(canonical) injection $j_N$: $H_N \to E$ is continuous, $R_N = j_N j_N^{*}$, and
$H_N$ is the closure of range($R_N$) under the inner product
$\langle R_N u, R_N v \rangle_N = \langle u, R_N v \rangle$. Here, $H_N^{*}$ is always identified with
$H_N$. $H_N$ is termed the reproducing kernel Hilbert space (RKHS) of $R_N$
(or $\mu_N$); it is actually the RKHS for the covariance function
$R_0$: $E' \times E' \to \mathbb{R}$, $R_0(u,v) = \langle u, R_N v \rangle$. It will be further assumed
that $H_N$ is separable; instances where this assumption is not
necessary will be noted. If $\mu_N$ is Radon, then $H_N$ is necessarily
separable [12].

The message measure $\mu_X$ is a probability on $(E, \sigma(E'))$. The
constraints to be imposed will ensure that $\mu_X$ has a covariance
operator $R_X$: $E' \to E$; it can be assumed (WLOG) that $\mu_X$ has zero mean.
As in the previous section, the measure of interest is $\mu_{XY}$, defined
by $\mu_{XY} = (\mu_X \otimes \mu_N) \circ f^{-1}$, where $f(x,y) = (x, x+y)$.

A basic result in the Shannon theory is that if the supports of
$\mu_X$ and $\mu_N$ are restricted to be of finite dimension and the covariance
of $\mu_X$ is fixed, then $I(\mu_{XY})$ is maximized when $\mu_X$ is Gaussian. From
this one obtains the result that the channel capacity problem can be
solved by assuming $\mu_X$ to be Gaussian (see [2, Lemma 6]). This
assumption will be made henceforth.

The observation measure $\mu_Y = (\mu_X \otimes \mu_N) \circ g^{-1}$, where $g(x,y) = x + y$,
is thus Gaussian, with covariance operator $R_Y$: $E' \to E$, $R_Y = R_X + R_N$.
Of course, $R_Y$ has a RKHS $H_Y$ contained in $E$ and $R_Y = j_Y j_Y^{*}$, where
$j_Y$: $H_Y \to E$ is the natural injection and is continuous.

The joint Gaussian measure $\mu_{XY}$ has a joint covariance operator
$\mathcal{R}_{XY}$: $E' \times E' \to E \times E$ [13], [7]. This operator and its properties are
characterized by the following result. It does not require that $H_N$ be
separable. Moreover, the result holds for any joint Gaussian measure
on $(E \times E, \sigma(E') \times \sigma(E'))$ having a covariance operator
$\mathcal{R}_{XY}$: $E' \times E' \to E \times E$.


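Lemma 1 [13], [7]:

(1) $\mathcal{R}_{XY} = j_{X \times Y} (\mathcal{I} + \mathcal{T}) j_{X \times Y}^{*}$, where
    $j_{X \times Y}$: $H_X \times H_Y \to E \times E$ is the natural injection, $\mathcal{I}$ is the
    identity, and $\mathcal{T}$ is a self-adjoint bounded linear operator in
    $H_X \times H_Y$ with $\|\mathcal{T}\| \le 1$.

(2) $\mathcal{T}(x,y) = (V_{XY} y, V_{XY}^{*} x)$, where $V_{XY}$: $H_Y \to H_X$ is a bounded
    linear operator with $\|V_{XY}\| \le 1$. The operator $V_{XY}$ is
    uniquely defined by
    $\int_{E \times E} \langle u,x \rangle \langle v,y \rangle \, d\mu_{XY}(x,y) = \langle j_X^{*} u, V_{XY} j_Y^{*} v \rangle_X$
    for all $u, v$ in $E'$.

(3) $\mu_{XY} \sim \mu_X \otimes \mu_Y$ if and only if $V_{XY}$ is Hilbert-Schmidt with
    $\|V_{XY}\| < 1$.

(4) When $V_{XY}$ is Hilbert-Schmidt with $\|V_{XY}\| < 1$, then
    $I(\mu_{XY}) = -\frac{1}{2} \sum_n \log(1 - \gamma_n)$, where $(\gamma_n)$ are the eigenvalues
    of $V_{XY}^{*} V_{XY}$.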

Lemma  1  is  fundamental  to  the  solution  of  the  channel  capacity 
problem.  It  enables  one  to  calculate  the  mutual  information,  yielding 
the  following  result. 
Lemma 2 [2], [7]: Suppose that $\mu_X$ is Gaussian. Then:

(1) $I(\mu_{XY}) < \infty$ if and only if $\bar{\mu}_X[\mathrm{range}(j_N)] = 1$, where $\bar{\mu}_X$ is
    the extension of $\mu_X$ to $\overline{\sigma(E')}^{\mu_X}$;

(2) $I(\mu_{XY}) < \infty$ if and only if $R_X = j_N T j_N^{*}$, where $T$: $H_N \to H_N$ is
    trace-class. When this is satisfied,
    $I(\mu_{XY}) = \frac{1}{2} \sum_n \log(1 + \tau_n)$, where $(\tau_n)$ are the eigenvalues of $T$.
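
The two expressions for the mutual information, Lemma 1(4) and Lemma 2(2), can be compared numerically in a simple finite-dimensional setting. The sketch below assumes independent coordinates with signal variances `signal_var` and noise variances `noise_var` (hypothetical numbers), so that $\tau_n$ is the signal-to-noise ratio in coordinate $n$ and $\gamma_n$ is taken to be the squared correlation between the $n$-th coordinates of $X$ and $Y = X + N$. This coordinatewise reading of $\gamma_n$ is an assumption made for the illustration, not a statement from the paper.

```python
import math


def info_from_tau(taus):
    """Lemma 2(2): I = (1/2) * sum_n log(1 + tau_n)."""
    return 0.5 * sum(math.log1p(t) for t in taus)


def info_from_gamma(gammas):
    """Lemma 1(4): I = -(1/2) * sum_n log(1 - gamma_n)."""
    return -0.5 * sum(math.log1p(-g) for g in gammas)


# Hypothetical per-coordinate variances (independent coordinates assumed).
signal_var = [4.0, 1.0, 0.25]
noise_var = [2.0, 1.0, 0.5]

# tau_n: signal-to-noise ratio; gamma_n: squared correlation of X_n and Y_n.
taus = [s / n for s, n in zip(signal_var, noise_var)]
gammas = [s / (s + n) for s, n in zip(signal_var, noise_var)]

print(info_from_tau(taus), info_from_gamma(gammas))
```

Both functions return the same number because $1 - \gamma_n = 1/(1 + \tau_n)$ in this setting.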

If the RKHS $H_N$ is not separable, then part (1) of Lemma 2 holds
with the condition $\bar{\mu}_X[\mathrm{range}(j_N)] = 1$ replaced by
$\mu_X^{*}[\mathrm{range}(j_N)] = 1$, where $\mu_X^{*}$ is the outer measure obtained from
$\mu_X$ on $\sigma(E')$. The following result is then useful.

Lemma 3 [7]: Suppose that $E$ is a locally convex l.t.v.s., $\mu$ a
probability measure on $(E, \sigma(E'))$. Suppose that $B$ is a separable
or reflexive Banach space and that $j$: $B \to E$ is a continuous
linear injection. Then, the following are equivalent:

(1) $\mu^{*}[j(B)] = 1$;

(2) $\mu = \nu \circ j^{-1}$, where $\nu$ is a unique probability measure on
    $(B, \sigma(B'))$.

If (1) or (2) holds, then $\mu$ is Gaussian if and only if $\nu$ is
Gaussian. If $B$ is both separable and reflexive, then
$j[B] \in \overline{\sigma(E')}^{\mu}$, so that (1) is equivalent to $\bar{\mu}[j(B)] = 1$.

In the mismatched channel to be considered subsequently, the
constraints are given in terms of the norm for another Hilbert
subspace of $E$. The following result is then useful. It does not require
that $H_N$ be separable.
Proposition 1: Suppose that $H_W$ is a Hilbert subspace of $E$. Let
$j_W$: $H_W \to E$ be the natural injection map, and suppose that
$\mu_X^{*}[\mathrm{range}(j_W)] = 1$. Then $I(\mu_{XY}) < \infty$ if and only if $H_W$ is a vector
subspace of $H_N$. If $H_W \subset H_N$, then $j_W$ is continuous, the natural
injection $J$: $H_W \to H_N$ is continuous, and $H_W$ is the RKHS for the
covariance operator $j_W j_W^{*}$; if $H_N$ is separable, then $H_W$ is also
separable.


Proof: Suppose that $H_W \subset H_N$. Since $H_W$ is a Hilbert space contained
in the RKHS $H_N$, $H_W$ is also a RKHS of functions on $E'$ and
$\|Jx\|_N \le k\|x\|_W$ for all $x$ in $H_W$, some $k < \infty$ [12], so that the natural
injection $J$: $H_W \to H_N$ is continuous. Since $j_W = j_N J$, $j_W$ must also be
continuous, so that $j_W j_W^{*}$ is a covariance operator mapping $E' \to E$. By
definition, $H_W$ is the (unique) RKHS for $j_W j_W^{*}$. To see that $H_W$ is
separable (assuming that $H_N$ is separable), one notes that the linear map
$L$: $H_N \to H_W$, $L j_N^{*} u = j_W^{*} u$, is continuous and has dense range in $H_W$, so
that $L$ has only $\{0\}$ in its null space. Thus, if $\{u_n, n \ge 1\}$ is such
that $\{j_N^{*} u_n, n \ge 1\}$ is dense in $H_N$, then $\{L j_N^{*} u_n, n \ge 1\}$ must be dense in
$H_W$. $I(\mu_{XY}) < \infty$ by Lemma 2, since $\bar{\mu}_X[\mathrm{range}(j_N)] = 1$.

If $H_W$ is not contained in $H_N$, then there exists $z$ in range($j_W$),
$z \notin$ range($j_N$). The Gaussian measure $\mu_X$ with covariance $z \otimes z$ has
$\bar{\mu}_X[\mathrm{range}(j_N)] = 0$; by Lemma 2, $I(\mu_{XY}) = \infty$. □


Constraints 

The  constraints  that  will  be  used  to  define  the  admissible  set  Q 
of  message  measures  are  the  following: 

(A-1) $\mu_X^{*}[\mathrm{range}(j_W)] = 1$.

(A-2) $\int_{H_W} \|x\|_W^2 \, d\nu_X(x) \le P$,

where $H_W$ is a Hilbert space contained in $E$ with norm $\|\cdot\|_W$,
$j_W$: $H_W \to E$ is the natural injection, and $\nu_X$ is the Borel measure on
$H_W$ satisfying $\mu_X = \nu_X \circ j_W^{-1}$.

Since we wish to have the constraint (A-2) apply a.e. $d\mu_X$, it is
first necessary to require (A-1). The existence of the measure $\nu_X$
such that $\mu_X = \nu_X \circ j_W^{-1}$ follows from Lemma 3; $H_W$ is separable, from
Proposition 1. Also by Proposition 1, the capacity will be infinite
if $H_W$ is not a vector subspace of $H_N$.

The constraint (A-2) is motivated by the typical application
when $E$ is $L_2[0,T]$. In the case of formal white noise, the constraint
is usually $E\int_0^T X_t^2(\omega)\,dt \le P$. This can be viewed as a constraint on
$E\|X\|_W^2$, where $W$ is the RKHS of the identity operator: this is the
covariance of formal white noise. When white noise is viewed as the
formal derivative of the Wiener process, then the "integrated"
channel is analyzed [5]. In that case, the transmitted signal $X$ is
defined by $X_t = \int_0^t u(s)\,ds$, $u$ in $L_2[0,T]$, and the constraint is
typically $E\|u\|_2^2 \le PT$; $\|u\|_2$ is the norm of $x$ in the RKHS of Wiener
measure. Finally, one may note that in his treatment of stationary
power-and-frequency-limited Gaussian channels when the noise has
integrable spectral density [4], Gallager first assumes a constraint
on the message of the form $E\|X\|_2^2 \le PT$. However, the transmitted
signal is obtained by passing the message through a linear filter
whose transfer function $G$ satisfies
$\int |G(\lambda)|^2/\phi_N(\lambda)\,d\lambda < \infty$, where $\phi_N$ is the
noise spectral density. Such a transmitted signal satisfies both
(A-1) and a constraint of the type (A-2), with (assuming that $|G|^2/\phi_N$
is bounded) an upper bound on $E\|X\|_{N,T}^2$ given by a constant multiple
(involving a factor $1/2\pi$) of $\sup_\lambda |G(\lambda)|^2/\phi_N(\lambda)$ for any
$T > 0$, where now $X$ refers to the filtered message and $\|\cdot\|_{N,T}$ is the
norm of the RKHS of the noise covariance for the interval $[0,T]$. Of
course, the constraint (A-2) is not placed explicitly on the transmitted
signal in [4]: instead, it appears in the solution for the capacity.
Gallager's analysis is for the water-filling model treated by Fano
[6]. Fano's treatment does not yield finite capacity, precisely
because the constraints (A-1) and (A-2) are not imposed.

In addition to its use in previous more specialized analyses,
the use of a Hilbert space norm is plausible from two other
considerations. First, as can be seen from Lemma 2, the capacity will be
infinite unless the constraint used implies $E\|X\|_N^2 \le P'$ for some
$P' < \infty$. Proposition 1 shows that $H_W$ must be a RKHS of functions on $E'$
if the capacity is to be finite. Second, a RKHS norm actually places
a dual constraint on the signal; this corresponds to limitations on
the amount and frequency distribution of the signal energy in typical
applications.

The capacity subject to the constraints (A-1) and (A-2) will be
denoted by $\mathcal{C}_W(P)$. If $H_W = H_N$ (consisting of the same elements and the
identical inner product), then the capacity will be denoted by $\mathcal{C}_N(P)$
and the channel is said to be matched (to the constraint). If $H_W \ne H_N$
as Hilbert spaces, then the channel is said to be mismatched. It will
be seen that in the matched case, the results can be directly related
to results obtained by Shannon [1] and Gallager [4] for the white
noise case and by Kadota, Zakai, and Ziv [5] when the noise is the
Wiener process (without a dimensionality constraint). Thus, these
results extend the aforementioned results to an arbitrary Gaussian
noise, rather than only formal white noise or the Wiener process.

In  the  case  of  the  mismatched  channel,  a  completely  new  set  of 
results  is  obtained.  These  results  differ  from  those  of  the  matched 
channel  not  only  in  the  value  of  the  capacity,  but  also  by  the 
properties  of  the  solution.  These  differences  will  be  discussed  after 
the  main  results  are  presented. 

Information  Capacity  of  the  Matched  Channel 

The solution for $\mathcal{C}_N(P)$ is given by the following theorem.

Theorem 1 [2], [7]:

(a) Suppose that $H_N$ is of dimension $\ge M$, and that (in addition
    to (A-1) and (A-2)) $\mu_X$ is required to satisfy
    $\dim[\mathrm{supp}(\mu_X)] \le M$. Then $\mathcal{C}_N(P) = (M/2) \log(1 + P/M)$. The
    supremum is attained, and only attained, when $\mu_X$ is
    Gaussian with zero mean and covariance operator $R_X = j_N T j_N^{*}$,
    where $T$: $H_N \to H_N$ is any self-adjoint linear operator with
    $M$-dimensional range space and with a single non-zero
    eigenvalue of value $P/M$.

(b) Suppose that $H_N$ is infinite-dimensional. Subject only to
    the constraints (A-1) and (A-2), $\mathcal{C}_N(P) = P/2$. The capacity
    cannot be attained.
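
The relation between parts (a) and (b) of Theorem 1 can be seen numerically: the finite-dimensional value $(M/2)\log(1 + P/M)$ increases with $M$ and approaches $P/2$ without reaching it. The following is a minimal sketch; the function name `matched_capacity` and the value of `P` are illustrative assumptions.

```python
import math


def matched_capacity(P, M=None):
    """Theorem 1: (M/2) log(1 + P/M) for support dimension <= M,
    and P/2 when no dimensionality constraint is imposed (M=None)."""
    if M is None:
        return P / 2.0
    return (M / 2.0) * math.log1p(P / M)


P = 3.0  # hypothetical constraint value
dims = [1, 2, 10, 100, 10_000]
values = [matched_capacity(P, M) for M in dims]
for M, c in zip(dims, values):
    print(M, c)
print("limit:", matched_capacity(P))
```

Every finite-dimensional value stays strictly below $P/2$, which is consistent with the statement that the capacity in part (b) cannot be attained.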

Shannon's original work [1] considered capacity for the white
Gaussian channel with noise of spectral density $N_0/2$ and the signal $S$
constrained in time to $T$ seconds, constrained in bandwidth to $W$
hertz, and constrained in average power by $E\int_0^T S_t^2\,dt \le PT$. His result,
one of the best-known results of early information theory, was that
the capacity is $WT \log(1 + P/(WN_0))$. Since the formal RKHS norm for
such noise is $\|x\|_N^2 = \int_0^T x_t^2\,dt/(N_0/2)$, Theorem 1(a) gives Shannon's
result by setting the dimensionality $M = 2WT$. Gallager [4] has
obtained this result for the white noise channel by considering the
signal as a point in a space of $2WT$ dimensions, and applying the same
energy constraint as that used by Shannon.

Part (b) of Theorem 1 is also a generalization of known results
for the white noise case, again generalizing results of Shannon and
Gallager. For white noise of spectral density $N_0/2$ and signal $S$ such
that $E\int_0^T S_t^2\,dt \le PT$, Shannon showed that the capacity without a
bandwidth constraint is at least $P/N_0$. Gallager proved that the capacity
is $P/N_0$ if there is no dimensionality constraint on the signal.
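
The reduction to Shannon's formula can be checked by direct substitution. In the sketch below the RKHS-norm constraint is taken to be $P_{\mathrm{RKHS}} = PT/(N_0/2)$, which follows from the formal norm quoted above; the numerical values of `P`, `W`, `T`, and `N0` are arbitrary assumptions chosen only to exercise the formulas.

```python
import math


def shannon_capacity(P, W, T, N0):
    """Shannon's band-limited result: W T log(1 + P / (W N0))."""
    return W * T * math.log(1.0 + P / (W * N0))


def theorem_1a(P_rkhs, M):
    """Theorem 1(a): (M/2) log(1 + P/M), with P the RKHS-norm constraint."""
    return (M / 2.0) * math.log(1.0 + P_rkhs / M)


# Hypothetical channel parameters.
P, W, T, N0 = 2.0, 5.0, 3.0, 0.5

P_rkhs = P * T / (N0 / 2.0)   # energy constraint expressed in the RKHS norm
M = 2.0 * W * T               # dimensionality used by Shannon and Gallager

finite_dim = theorem_1a(P_rkhs, M)
shannon = shannon_capacity(P, W, T, N0)

# Theorem 1(b): P_rkhs / 2 over T seconds, i.e. P / N0 per unit time.
unconstrained_rate = (P_rkhs / 2.0) / T

print(finite_dim, shannon, unconstrained_rate)
```

The last quantity equals $P/N_0$, the value attributed above to Shannon (as a lower bound) and Gallager.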

The result of Theorem 1(b) has also been obtained by Kadota,
Zakai, and Ziv [5] for the case where the channel noise is the Wiener
process (the "integrated white noise" channel).

Theorem 1 thus extends some of the well-known results of information
theory, previously obtained only for the formal white noise
channel (or, for part (b), the Wiener process channel) to a general
Gaussian channel without feedback. The results of Theorem 1 hold for
any Gaussian noise measure whose covariance operator maps into $E$ and
which has a separable RKHS. In particular, they hold whenever the
space $E$ is a separable Banach space.

Capacity of the Mismatched Gaussian Channel

The capacity problem for the case where $H_W \ne H_N$ (as Hilbert
spaces) provides a degree of flexibility which is lacking in the
matched channel. By Lemma 2, necessary and sufficient conditions for
finite capacity are that $\bar{\mu}_X[\mathrm{range}(j_N)] = 1$ and $E\|X\|_N^2 \le P'$ for some
$P' < \infty$. However, one may wish to use more selectivity in the choice
of constraint. As noted above, the RKHS norm $\|x\|_W^2$ can be viewed as a
constraint on both the amount and the frequency distribution of the
signal energy.

In the matched case, the constraint $E\|X\|_N^2 \le P$ constrains the
frequency content of the energy in a manner determined solely by the
channel noise. It is more desirable to constrain the frequency
distribution in a manner which not only satisfies constraints imposed
by the RKHS of the channel noise, but also satisfies additional
constraints. Such additional constraints are needed in order to
analyze channels with partially unknown noise, including jamming
channels.

In  this  section,  the  results  corresponding  to  Theorem  1  will  be 
given,  now  assuming  a  mismatched  channel.  For  more  details,  reference 
is  made  to  [3],  where  the  corresponding  results  are  obtained  for  the 
case  where  E  is  a  separable  Hilbert  space.  Those  results  can  be 
extended  to  E  a  separable  complete  metric  space  by  using  the  Banach- 
Mazur  theorem  and  Kuratowski’s  Borel  mapping  theorem.  However,  in 
applications  one  may  deal  with  a  linear  space  that  is  not  metrizable, 
or  not  complete,  or  not  separable.  The  extension  of  those  results  to 
the  present  framework  can  be  carried  out  by  using  Lemmas  1-3  and 
Proposition  1  to  adapt  the  proofs  given  in  [3]. 

The constraints (A-1) and (A-2) involve the unique Gaussian
Borel measure $\nu_X$ on $H_W$ satisfying $\mu_X = \nu_X \circ j_W^{-1}$. Of course, $\nu_X$ has a
covariance operator $R_W$: $H_W \to H_W$, and $R_X = j_W R_W j_W^{*}$. At the same time,
$R_X = j_N T j_N^{*}$, where $T$ is the covariance operator (in $H_N$) of the Gaussian
measure $\nu_N$ satisfying $\mu_X = \nu_N \circ j_N^{-1}$; $T$ is trace-class
in $H_N$ and $I(\mu_{XY}) = \frac{1}{2} \sum_n \log(1+\tau_n)$. Moreover, $\nu_N = \nu_X \circ J^{-1}$, $J$
the imbedding map, and so $T = J R_W J^{*}$.

From this it follows that one can assume that $\overline{\mathrm{range}(J)} = H_N$
(otherwise, restrict attention to $\overline{\mathrm{range}(J)}$). One then has
$R_W = J^{-1} T J^{*-1}$ and the constraint (A-2) becomes
$\sum_n \tau_n \|J^{-1} u_n\|_W^2 \le P$. Define

$$I_N + S = (J^{-1})^{*} J^{-1} = (J J^{*})^{-1} \qquad (3)$$

where $I_N$ is the identity in $H_N$. The operator $S$: $H_N \to H_N$ is densely
defined. The limit points of the spectrum of $S$ consist of all
eigenvalues of infinite multiplicity, all limit points of distinct
eigenvalues, and all points of the continuous spectrum [13]. Let $\theta$ be the
smallest limit point of the spectrum of $S$, and let $\{\lambda_n, n \ge 1\}$ be the
set of all eigenvalues of $S$ which are strictly less than $\theta$, with
corresponding o.n. eigenvectors $\{e_n, n \ge 1\}$. Of course, $\{\lambda_n, n \ge 1\}$ can
be empty, finite, or countably infinite.


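The capacity problem now becomes that of determining

$$\mathcal{C}_W(P) = \sup \frac{1}{2} \sum_n \log(1 + \tau_n) \qquad (4)$$

subject to the constraint

$$\sum_n \tau_n \|J^{-1} u_n\|_W^2 \le P \qquad (5)$$

(equations (6) and (7) are illegible in the source), where

$$R_X = \sum_n \tau_n \, j_N (u_n \otimes u_n) j_N^{*}, \qquad (8)$$

where $(\tau_n)$ is any nonnegative summable sequence of real numbers,
$\{u_n, n \ge 1\}$ any CONS in $H_N$ which belongs to range($J$).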

The results for the mismatched channel which correspond to part
(a) of Theorem 1 can now be stated.
Theorem 2 [3]: Suppose that $H_N$ has dimension $M < \infty$. The capacity is
then

$$\mathcal{C}_W(P) = \frac{1}{2} \sum_{n=1}^{K} \log \left[ \frac{\sum_{i=1}^{K} \beta_i + P + K}{K(1+\beta_n)} \right]$$

where $\beta_1 \le \beta_2 \le \cdots \le \beta_M$ are the eigenvalues of $S$, and $K$ is the
largest integer $\le M$ such that $\sum_{i=1}^{K} \beta_i + P \ge K\beta_K$. The capacity is
attained by a Gaussian $\mu_X$ with covariance operator (8), where
$\tau_n = \frac{\sum_{i=1}^{K} \beta_i + P + K}{K(1+\beta_n)} - 1$ for $n \le K$, $\tau_n = 0$ for $n > K$, and
$\{u_n, n \ge 1\}$ are o.n. eigenvectors of $S$ corresponding to the
eigenvalues $(\beta_n)$. No other Gaussian $\mu_X$ can attain capacity. The
same result is obtained if $H_N$ has dimension $L < \infty$ and $\mu_X$ is
constrained to have support of dimension $M \le L$.
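
Theorem 2 can be read as a water-filling rule, and the following sketch computes $K$, the capacity, and the eigenvalues $\tau_n$ from a list of eigenvalues of $S$. The function name `mismatched_capacity_finite` and the example numbers are hypothetical. The example assumes a diagonal setting in which the noise has variances $\sigma_n^2$ and the constraint is a pure power constraint, so that $1 + \beta_n = \sigma_n^2$; this identification is an assumption of the illustration, chosen because it reduces to the classical picture in which signal-plus-noise power is constant over the coordinates used.

```python
import math


def mismatched_capacity_finite(betas, P):
    """Sketch of Theorem 2: returns (capacity, K, taus).

    betas: eigenvalues of S (each must exceed -1, since I + S is positive).
    P:     constraint value in (A-2).
    """
    b = sorted(betas)
    if any(x <= -1.0 for x in b):
        raise ValueError("eigenvalues of S must be greater than -1")
    K, best_sum, partial = 1, b[0], 0.0
    for k in range(1, len(b) + 1):
        partial += b[k - 1]
        # K is the largest k with sum_{i<=k} beta_i + P >= k * beta_k.
        if partial + P >= k * b[k - 1]:
            K, best_sum = k, partial
    level = (best_sum + P + K) / K            # common value of (1+tau_n)(1+beta_n)
    taus = [level / (1.0 + b[n]) - 1.0 if n < K else 0.0 for n in range(len(b))]
    capacity = 0.5 * sum(math.log(level / (1.0 + b[n])) for n in range(K))
    return capacity, K, taus


# Hypothetical example: noise variances with a pure power constraint,
# so that beta_n = sigma_n^2 - 1 (an assumption of this illustration).
noise_var = [1.0, 2.0, 5.0]
betas = [v - 1.0 for v in noise_var]
P = 3.0

capacity, K, taus = mismatched_capacity_finite(betas, P)
signal_var = [t * v for t, v in zip(taus, noise_var)]
print(K, capacity, taus, signal_var)
```

In this example the two coordinates in use satisfy signal variance plus noise variance equal to 3, while the noisiest coordinate receives no signal power. Setting all $\beta_n = 0$ recovers the matched value $(M/2)\log(1 + P/M)$ of Theorem 1(a).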


The above result assumes $H_N$ to be finite-dimensional. The
following theorem extends to the case where $H_W$ is infinite-dimensional,
but the support of $\mu_X$ is restricted to be of finite
dimension.


Theorem 3 [3]: Suppose that $H_W$ is infinite-dimensional, and that
support($\mu_X$) is restricted to have dimension $\le M < \infty$.

(1) Suppose that $\theta < \infty$.

    (a) If $\{\lambda_n, n \ge 1\}$ is empty, then
        $\mathcal{C}_W(P) = (M/2) \log[1 + P M^{-1} (1+\theta)^{-1}]$. Capacity can be
        attained if and only if $S$ has $\theta$ as an eigenvalue of
        multiplicity $\ge M$. In this case $\mathcal{C}_W(P)$ is attained only
        by a Gaussian $\mu_X$ with covariance (8), where
        $\tau_i = P M^{-1} (1+\theta)^{-1}$ for $i \le M$ with $\{u_1,\ldots,u_M\}$ any o.n.
        set in the null space of $S - \theta I$.

    (b) If $K\lambda_K \le \sum_{i=1}^{K} \lambda_i + P <$ [upper bound illegible in the
        source] for some $K \le M$, then the capacity is as in
        Theorem 2, with $\beta_i = \lambda_i$, $i = 1,\ldots,K$, and can be
        similarly attained.

    (c) Let $K = \min(L,M)$, where $L \ge 1$ is the number of
        eigenvalues $(\lambda_n)$ of $S$ whose value is strictly less
        than $\theta$, and suppose that $P + \sum_{j=1}^{K} \lambda_j \ge K\theta$. The capacity
        is then

        $$\mathcal{C}_W(P) = \frac{1}{2} \sum_{n=1}^{K} \log \left[ \frac{1+\theta}{1+\lambda_n} \right] + \frac{M}{2} \log \left[ 1 + \frac{P + \sum_{n=1}^{K} (\lambda_n - \theta)}{M(1+\theta)} \right].$$

        The capacity can be attained if and only if $\theta$ is an
        eigenvalue of $S$ with multiplicity $\ge M-K$. The capacity
        is then achieved only by a Gaussian $\mu_X$ with covariance
        (8), where $\tau_n = (\sum_{i=1}^{K} \lambda_i + P - M\lambda_n + (M-K)\theta)(1+\lambda_n)^{-1} M^{-1}$
        for $n \le K$, with $S u_n = \lambda_n u_n$ and $\{u_1,\ldots,u_K\}$ an o.n. set;
        and with $u_n = v_n$ and $\tau_n = (P + \sum_{i=1}^{K} \lambda_i - K\theta) M^{-1} (1+\theta)^{-1}$
        for $K+1 \le n \le M$, where $S v_n = \theta v_n$ and $\{v_{K+1},\ldots,v_M\}$ is an
        o.n. set.

(2) If $\theta = \infty$, then $\mathcal{C}_W(P)$ has the value given in part (1)(b), and
    can be similarly attained.
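
The closed form in part (1)(c) of Theorem 3 can be cross-checked against a direct water-filling computation over the $K$ eigenvalues $\lambda_n$ together with $M-K$ copies of $\theta$. The sketch below does this for one hypothetical spectrum; the function names and numbers are illustrative assumptions, and the direct computation is a consistency check written for this example rather than a method quoted from [3].

```python
import math


def capacity_3c(lams, theta, P, M):
    """Closed form of Theorem 3, part (1)(c)."""
    lams = sorted(lams)
    K = min(len(lams), M)
    used = lams[:K]
    if K < 1 or P + sum(used) < K * theta:
        raise ValueError("hypotheses of part (c) are not met")
    first = 0.5 * sum(math.log((1.0 + theta) / (1.0 + lam)) for lam in used)
    excess = P + sum(lam - theta for lam in used)
    second = (M / 2.0) * math.log(1.0 + excess / (M * (1.0 + theta)))
    return first + second


def capacity_direct(lams, theta, P, M):
    """Water-filling over K eigenvalues lam_n and M-K copies of theta."""
    lams = sorted(lams)
    K = min(len(lams), M)
    spectrum = lams[:K] + [theta] * (M - K)
    level = (P + M + sum(spectrum)) / M      # common value of (1+tau)(1+eigenvalue)
    return 0.5 * sum(math.log(level / (1.0 + s)) for s in spectrum)


# Hypothetical spectrum: two eigenvalues below theta, support dimension 4.
lams, theta, P, M = [0.0, 0.5], 1.0, 4.0, 4

closed_form = capacity_3c(lams, theta, P, M)
direct = capacity_direct(lams, theta, P, M)
print(closed_form, direct)
```

The agreement of the two values reflects the identity $1 + (P + \sum_n (\lambda_n - \theta))/(M(1+\theta)) = c/(1+\theta)$, where $c$ is the common water-filling level.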


Theorems 2 and 3 together are parallel to part (a) of Theorem 1.
The solution for the mismatched channel is seen to be considerably
more complex than that for the matched channel. The final
generalization is to permit $\mu_X$ to have infinite-dimensional support. The
solution will again be much more complex than the solution to the
corresponding problem for the matched channel (part (b) of Theorem 1).
Theorem 4 [3]:

(1) Suppose that $\theta < \infty$, $H_W$ is infinite-dimensional, and
    $\dim[\mathrm{supp}(\mu_X)]$ is not constrained.

    (a) If $\{\lambda_n, n \ge 1\}$ is not empty, and $\sum_n (\theta - \lambda_n) \le P$, then

        $$\mathcal{C}_W(P) = \frac{1}{2} \sum_n \log \left[ \frac{1+\theta}{1+\lambda_n} \right] + \frac{P + \sum_m (\lambda_m - \theta)}{2(1+\theta)}.$$

    (b) If $\{\lambda_n, n \ge 1\}$ is not empty, and $P < \sum_n (\theta - \lambda_n)$, then
        there exists a largest integer $K$ such that
        $\sum_{i=1}^{K} \lambda_i + P \ge K\lambda_K$, and

        $$\mathcal{C}_W(P) = \frac{1}{2} \sum_{n=1}^{K} \log \left[ \frac{\sum_{i=1}^{K} \lambda_i + P + K}{K(1+\lambda_n)} \right].$$

    (c) If $\{\lambda_n, n \ge 1\}$ is empty, then $\mathcal{C}_W(P) = \frac{P}{2(1+\theta)}$.

    (d) In (a), the capacity can be attained if and only if
        $\sum_n (\theta - \lambda_n) = P$. It is then attained, and only attained,
        by a Gaussian $\mu_X$ with covariance operator as in (8),
        where $u_n = e_n$ and $\tau_n = (\theta - \lambda_n)(1+\lambda_n)^{-1}$ for all $n \ge 1$.
        In (b), the capacity can be attained by a unique
        Gaussian $\mu_X$ with covariance operator (8), where
        $u_n = e_n$ and $\tau_n = \frac{\sum_{i=1}^{K} \lambda_i + P + K}{K(1+\lambda_n)} - 1$ for $n \le K$; $\tau_n = 0$ for
        $n > K$. In (c), the capacity cannot be attained.

(2) If $\theta = \infty$, then $\mathcal{C}_W(P)$ has the value given in part (1)(b), and
    can be similarly attained.
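
The case analysis of Theorem 4, part (1), can be sketched as a single function. The version below assumes that only finitely many eigenvalues $\lambda_n$ lie below $\theta$ (the theorem also allows an infinite sequence, which this sketch does not handle); the function name and the sample spectrum are hypothetical. The checks illustrate that the formulas of cases (a) and (b) agree at the boundary $P = \sum_n (\theta - \lambda_n)$.

```python
import math


def capacity_unconstrained(lams, theta, P):
    """Sketch of Theorem 4, part (1), for a finite list of eigenvalues
    lams of S that are strictly below theta (theta finite)."""
    lams = sorted(lams)
    if not lams:                                    # case (c)
        return P / (2.0 * (1.0 + theta))
    gap = sum(theta - lam for lam in lams)
    if gap <= P:                                    # case (a)
        head = 0.5 * sum(math.log((1.0 + theta) / (1.0 + lam)) for lam in lams)
        return head + (P - gap) / (2.0 * (1.0 + theta))
    # case (b): largest K with sum_{i<=K} lam_i + P >= K * lam_K
    K, best_sum, partial = 1, lams[0], 0.0
    for k, lam in enumerate(lams, start=1):
        partial += lam
        if partial + P >= k * lam:
            K, best_sum = k, partial
    level = (best_sum + P + K) / K
    return 0.5 * sum(math.log(level / (1.0 + lam)) for lam in lams[:K])


# Hypothetical spectrum: theta = 1 with two eigenvalues below it.
lams, theta = [0.0, 0.5], 1.0
boundary = sum(theta - lam for lam in lams)         # P at which (a) and (b) meet

at_boundary = capacity_unconstrained(lams, theta, boundary)
just_below = capacity_unconstrained(lams, theta, boundary - 1e-9)
empty_case = capacity_unconstrained([], theta, 4.0)
print(at_boundary, just_below, empty_case)
```

With an empty list and $\theta = 0$ the function returns $P/2$, the matched value of Theorem 1(b).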


Discussion 

The results summarized in Theorems 1-4 provide a general
solution to the capacity problem for the Gaussian channel without
feedback, requiring a minimal set of assumptions. The solution for
the mismatched channel is markedly different from that for the
matched channel. The value of the capacity can be very different, as
already seen; it can be less or more than the capacity of the matched
channel, depending on $\theta$ and $\{\lambda_n, n \ge 1\}$. The expression for the
capacity varies as a function of $P$ and $H_W$. Moreover, the problem
of attaining capacity is much more significant. Even in the
finite-dimensional channel the vectors $u_1,\ldots,u_M$ must be a specific
set of vectors, not just any o.n. set. If $H_W$ is infinite-dimensional
with $\dim[\mathrm{supp}(\mu_X)] \le M$, the situation is even worse in (a) and (c) of
Theorem 3. That is, capacity can then be attained only if $S$ has $\theta$ as
an eigenvalue of multiplicity $\ge M$ when $S \ge \theta I$, or of multiplicity
$\ge M-K$ when $S$ has $K < M$ eigenvalues $\lambda_1 \le \cdots \le \lambda_K < \theta$ and
$P + \sum_{i=1}^{K} \lambda_i \ge K\theta$.

For the infinite-dimensional channel without a constraint on
$\dim[\mathrm{supp}(\mu_X)]$, there can again be significant differences between
$\mathcal{C}_W(P)$ and $\mathcal{C}_N(P)$, depending on $\{\theta; \lambda_n, n \ge 1\}$. Moreover, there is again
a rather different situation in the problem of attaining capacity.
$\mathcal{C}_N(P)$ can never be attained; $\mathcal{C}_W(P)$ can be attained if and only if
$\{\lambda_n, n \ge 1\}$ is not empty and $P \le \sum_n (\theta - \lambda_n)$.

A comparison of the value of the capacity $\mathcal{C}_W(P)$ for the
mismatched channel with the capacity $\mathcal{C}_N(P)$ of the matched channel can be
made from the preceding results. For the finite-dimensional channel,
$\mathcal{C}_W(P)$ is strictly greater than $\mathcal{C}_N(P)$ if [...] $< 0$, or if $P +$ [...] $< 0$
(the two conditions are only partly legible in the source);
$\mathcal{C}_W(P) \le \mathcal{C}_N(P)$ if $0 \le \beta_1 \le \cdots \le \beta_M$. For the infinite-dimensional channel,
suppose that $\{\lambda_n, n \ge 1\}$ is empty. Then, $\mathcal{C}_W(P) > \mathcal{C}_N(P)$ if $\theta < 0$,
$\mathcal{C}_W(P) < \mathcal{C}_N(P)$ if $\theta > 0$, $\mathcal{C}_W(P) = \mathcal{C}_N(P)$ if $\theta = 0$. If $\{\lambda_n, n \ge 1\}$ is not
empty, then for the unconstrained channel $\mathcal{C}_W(P)$ is greater than
$P/[2(1+\theta)]$. Thus, $\mathcal{C}_W(P) > \mathcal{C}_N(P)$ if $\theta \le 0$ and $\{\lambda_n, n \ge 1\}$ is not empty.
A similar result can be obtained for the constrained channel.

Applications 

The results on the mismatched channel can be used to analyze
channels with partially unknown noise, including jamming channels.
The results can also be applied to compare the capacity of channels
with and without feedback. It has been possible to show that the
capacity of a large class of mismatched Gaussian channels is
increased by adding linear feedback [16]; for example, this class
includes the time-discrete correlated noise channel with a pure power
constraint. This illustrates another difference between matched and
mismatched channels: it is not possible to increase capacity of
matched Gaussian channels by adding feedback.

